Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
Abstract
Deep neural network quantization with adaptive bitwidths has gained increasing attention due to the ease of model deployment on various platforms with different resource budgets. In this paper, we propose a meta-learning approach to achieve this goal. Specifically, we propose MEBQAT, a simple yet effective way of bitwidth-adaptive quantization-aware training (QAT) in which meta-learning is effectively combined with QAT by redefining meta-learning tasks to incorporate bitwidths. After being deployed on a platform, MEBQAT allows the (meta-)trained model to be quantized to any candidate bitwidth with minimal inference accuracy drop. Moreover, in a few-shot learning scenario, MEBQAT can also adapt a model to any bitwidth as well as to unseen target classes by adding conventional optimization-based or metric-based meta-learning. We design variants of MEBQAT to support both (1) a bitwidth-adaptive quantization scenario and (2) a new few-shot learning scenario where both quantization bitwidths and target classes are jointly adapted. Our experiments show that merging bitwidths into meta-learning tasks results in remarkable performance improvement: 98.7% less storage cost compared to bitwidth-dedicated QAT and 94.7% less back propagation compared to bitwidth-adaptive QAT in bitwidth-only adaptation scenarios, while improving classification accuracy by up to 63.6% compared to vanilla meta-learning in bitwidth-class joint adaptation scenarios.
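The paper itself holds the actual training procedure; purely as a hedged sketch of the stated idea (meta-learning tasks redefined to incorporate bitwidths), the NumPy toy below pairs each task with a sampled bitwidth and applies a first-order MAML-style update through a quantized forward pass. The quantizer, the squared-error loss, and all names (quantize_uniform, task_grad, mebqat_like_meta_step) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric per-tensor uniform quantizer; a generic stand-in for
    whatever quantization scheme the trained model actually uses."""
    if bits >= 32:
        return w
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) + 1e-8
    return np.round(w / scale * levels) / levels * scale

def task_grad(w, x, y, bits):
    """Gradient of a squared-error loss with a quantized forward pass,
    using the straight-through estimator: the gradient w.r.t. the
    quantized weights is applied to the full-precision weights."""
    wq = quantize_uniform(w, bits)
    err = x @ wq - y
    return x.T @ err / len(x)

def mebqat_like_meta_step(w, tasks, bitwidths, lr_inner=0.05, lr_outer=0.01):
    """One first-order meta-update in the spirit of the abstract: every
    task is (support, query) data *plus* a sampled bitwidth, so the
    meta-learned weights stay accurate after quantization to any
    candidate bitwidth."""
    meta_grad = np.zeros_like(w)
    for (xs, ys, xq, yq) in tasks:
        bits = int(np.random.choice(bitwidths))              # bitwidth is part of the task
        w_adapt = w - lr_inner * task_grad(w, xs, ys, bits)  # inner QAT step
        meta_grad += task_grad(w_adapt, xq, yq, bits)        # outer (query) gradient
    return w - lr_outer * meta_grad / len(tasks)

# Toy usage: meta-train a linear model that tolerates 2/4/8-bit weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 1))
tasks = [(rng.normal(size=(16, 8)), rng.normal(size=(16, 1)),
          rng.normal(size=(16, 8)), rng.normal(size=(16, 1))) for _ in range(4)]
for _ in range(100):
    w = mebqat_like_meta_step(w, tasks, bitwidths=[2, 4, 8])
```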
Similar resources
Meta-learning approach to neural network optimization
Optimizing a neural network's topology, weights, and neuron transfer functions for a given data set and problem is not an easy task. In this article, we focus primarily on building an optimal feed-forward neural network classifier for i.i.d. data sets. We apply meta-learning principles to the optimization of neural network structure and function. We show that diversity promotion, ensembling, self-organ...
Adaptive Quantization for Deep Neural Network
In recent years, Deep Neural Networks (DNNs) have developed rapidly across various applications, with increasingly complex architectures. The performance gains of these DNNs generally come with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Deep model quantization can be used to reduce the computation and memory costs of DNNs...
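The excerpt stops before the method details; as one hedged way to picture "adaptive" quantization, the sketch below assigns each layer the smallest bitwidth whose quantization error stays under a fixed budget. This greedy heuristic and its names (quant_error, pick_layer_bitwidths) are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def quant_error(w, bits):
    """Mean squared error introduced by symmetric uniform quantization."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) + 1e-8
    wq = np.round(w / scale * levels) / levels * scale
    return float(np.mean((w - wq) ** 2))

def pick_layer_bitwidths(layers, budget=1e-5, candidates=(2, 4, 8, 16)):
    """Assign each layer the smallest candidate bitwidth whose
    quantization MSE stays under the error budget."""
    chosen = []
    for w in layers:
        bits = next((b for b in candidates if quant_error(w, b) <= budget),
                    candidates[-1])
        chosen.append(bits)
    return chosen

rng = np.random.default_rng(1)
layers = [rng.normal(scale=s, size=(64, 64)) for s in (0.01, 0.1, 1.0)]
# Layers with a wider dynamic range need more bits under a fixed
# absolute-error budget, e.g. [4, 8, 16] here.
print(pick_layer_bitwidths(layers))
```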
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
We propose DoReFa-Net, a method to train convolutional neural networks that have low-bitwidth weights and activations using low-bitwidth parameter gradients. In particular, during the backward pass, parameter gradients are stochastically quantized to low-bitwidth numbers before being propagated to convolutional layers. As convolutions during forward/backward passes can now operate on low bitwidth w...
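A minimal sketch of the stochastic k-bit gradient quantization the excerpt describes, following the published DoReFa-Net formulation as best recalled: scale the gradient into [0, 1], add uniform noise, round to 2^k − 1 levels, then rescale. Treat the details (per-tensor scaling, the clip, noise magnitude) as approximations of the paper rather than its exact code.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_k(x, k):
    """k-bit quantizer for inputs in [0, 1]: round to one of
    2^k - 1 evenly spaced levels."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def quantize_grad(dr, k):
    """Stochastic k-bit gradient quantization: scale the gradient into
    [0, 1], add uniform noise before rounding (so the rounding is
    unbiased on average), then map back to the original range."""
    m = np.max(np.abs(dr)) + 1e-8                 # per-tensor scale
    noise = (rng.uniform(size=dr.shape) - 0.5) / (2 ** k - 1)
    x = dr / (2 * m) + 0.5 + noise                # into [0, 1], plus noise
    return 2 * m * (quantize_k(np.clip(x, 0, 1), k) - 0.5)

# Usage: quantize a toy gradient tensor to 6 bits.
g = rng.normal(size=(3, 3))
gq = quantize_grad(g, k=6)
```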
Deep Neural Network Capacity
In recent years, deep neural networks have exhibited powerful discriminative ability in many computer vision applications. However, the capacity of a deep neural network architecture is still a mystery to researchers. Intuitively, a network with larger capacity can always store more information to improve the discriminative ability of the model. But the learnable paramet...
Centroid neural network adaptive resonance theory for vector quantization
In this paper, a novel unsupervised competitive learning algorithm, called the centroid neural network adaptive resonance theory (CNN-ART) algorithm, is proposed to relieve the dependence on the initial codewords of the codebook, in contrast to conventional vector quantization algorithms for lossy image compression. The design of the CNN-ART algorithm is mainly based on the adaptive reso...
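The excerpt is truncated before the algorithm itself; below is a minimal competitive-learning sketch of the centroid-style codeword update (the running-mean rule w ← w + (x − w)/n) that centroid neural networks build on. The ART-based vigilance and codeword-creation machinery of CNN-ART is omitted, and the function names are hypothetical.

```python
import numpy as np

def centroid_vq(data, codebook, epochs=10):
    """Centroid-style competitive learning for vector quantization:
    the winning codeword moves so that it remains the running centroid
    (mean) of all vectors assigned to it so far."""
    counts = np.zeros(len(codebook))
    for _ in range(epochs):
        for x in data:
            j = np.argmin(np.linalg.norm(codebook - x, axis=1))  # winner
            counts[j] += 1
            codebook[j] += (x - codebook[j]) / counts[j]         # running mean
    return codebook

# Usage: learn an 8-codeword codebook for 4-dimensional vectors.
rng = np.random.default_rng(2)
data = rng.normal(size=(200, 4))
codebook = data[rng.choice(len(data), size=8, replace=False)].copy()
codebook = centroid_vq(data, codebook)
```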
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-19775-8_13